Highlighting nonlinear patterns in population genetics datasets

Authors

  • Gregorio Alanis-Lobato
  • Carlo Vittorio Cannistraci
  • Anders Eriksson
  • Andrea Manica
  • Timothy Ravasi

Abstract

Detecting structure in population genetics and case-control studies is important, as it exposes phenomena such as ecoclines, admixture and stratification. Principal Component Analysis (PCA) is a linear dimension-reduction technique commonly used for this purpose, but it struggles to reveal complex, nonlinear data patterns. In this paper we introduce non-centred Minimum Curvilinear Embedding (ncMCE), a nonlinear method to overcome this problem. Our analyses show that ncMCE can separate individuals into ethnic groups in cases in which PCA fails to reveal any clear structure. This increased discrimination power arises from ncMCE's ability to better capture the phylogenetic signal in the samples, whereas PCA better reflects their geographic relation. We also demonstrate how ncMCE can discover interesting patterns, even when the data has been poorly pre-processed. The juxtaposition of PCA and ncMCE visualisations provides a new standard of analysis with utility for discovering and validating significant linear/nonlinear complementary patterns in genetic data.
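
The abstract does not spell out how ncMCE works, so the following is only a minimal sketch of the idea its name suggests, under stated assumptions: that "minimum curvilinear" distances are path lengths measured along a minimum spanning tree built over the samples, and that "non-centred" means the resulting distance matrix is decomposed without mean-centring. The function names (`pairwise_distances`, `minimum_spanning_tree`, `tree_distances`) are hypothetical and the points are toy data, not genotypes. The sketch shows only the distance step, which is where a nonlinear method could depart from PCA's straight-line geometry.

```python
import math

def pairwise_distances(points):
    """Plain Euclidean distances between all pairs of samples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix.

    Returns an adjacency list {node: [(neighbour, weight), ...]}.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    adj = {i: [] for i in range(n)}
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            w = dist[u][parent[u]]
            adj[u].append((parent[u], w))
            adj[parent[u]].append((u, w))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return adj

def tree_distances(adj):
    """Path length between every pair of nodes, measured along the tree."""
    n = len(adj)
    out = [[0.0] * n for _ in range(n)]
    for start in range(n):
        stack = [(start, -1, 0.0)]
        while stack:
            node, prev, d = stack.pop()
            out[start][node] = d
            for nxt, w in adj[node]:
                if nxt != prev:
                    stack.append((nxt, node, d + w))
    return out

# Toy samples lying along an L-shaped curve (assumed data, for illustration only).
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
euclid = pairwise_distances(points)
curvilinear = tree_distances(minimum_spanning_tree(euclid))
print(euclid[0][4], curvilinear[0][4])
```

For the two end points of the curve, the straight-line distance is about 2.83 while the along-the-tree distance is 4.0, so the tree-based matrix preserves the ordering of samples along the curve. Under the assumptions above, an embedding would then be obtained by applying a singular value decomposition to this matrix without centring it; that step is omitted here.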

Similar resources

Extreme patterns of variance in small populations: placing limits on human Y-chromosome diversity through time in the Vanuatu Archipelago.

Small populations are dominated by unique patterns of variance, largely characterized by rapid drift of allele frequencies. Although the variance components of genetic datasets have long been recognized, most population genetic studies still treat all sampling locations equally despite differences in sampling and effective population sizes. Because excluding the effects of variance can lead to ...

Genomic Patterns of Geographic Differentiation in Drosophila simulans.

Geographic patterns of genetic differentiation have long been used to understand population history and to learn about the biological mechanisms of adaptation. Here we present an examination of genomic patterns of differentiation between northern and southern populations of Australian and North American Drosophila simulans, with an emphasis on characterizing signals of parallel differentiation....

Integrative testing of how environments from the past to the present shape genetic structure across landscapes.

Tests of the genetic structure of empirical populations typically focus on the correlative relationships between population connectivity and geographic and/or environmental factors in landscape genetics. However, such tests may overlook or misidentify the impact of candidate factors on genetic structure, especially when connectivity patterns differ between past and present populations because o...

The Spatial Mixing of Genomes in Secondary Contact Zones.

Recent genomic studies have highlighted the important role of admixture in shaping genome-wide patterns of diversity. Past admixture leaves a population genomic signature of linkage disequilibrium (LD), reflecting the mixing of parental chromosomes by segregation and recombination. These patterns of LD can be used to infer the timing of admixture, but the results of inference can depend strongl...

Using uniformat and gene[rate] to Analyze Data with Ambiguities in Population Genetics.

Some genetic systems frequently present ambiguous data that cannot be straightforwardly analyzed with common methods of population genetics. Two possibilities arise to analyze such data: one is the arbitrary simplification of the data and the other is the development of methods adapted to such ambiguous data. In this article, we present an attempt at such a development, the uniformat grammar an...

Journal title:

Volume 5, Issue

Pages -

Publication date: 2015